Method for coding and decoding impulse responses of audio
signals

The invention relates to a method and to an apparatus for
coding and decoding impulse responses of audio signals,
especially for describing the presentation of sound sources
encoded as audio objects according to the MPEG-4 Audio
standard.

Background

MPEG-4, as defined in the MPEG-4 Audio standard ISO/IEC
14496-3 and the MPEG-4 Systems standard 14496-1, facilitates
a wide variety of applications by supporting the
representation of audio objects. For the combination of the
audio objects, additional information - the so-called scene
description - determines the placement in space and time and
is transmitted together with the coded audio objects.

For playback the audio objects are decoded separately and
composed using the scene description in order to prepare a 
single soundtrack, which is then played to the listener. 

For efficiency, the MPEG-4 Systems standard ISO/IEC 14496-1
defines a way to encode the scene description in a binary
representation, the so-called Binary Format for Scene
Description (BIFS). Correspondingly, audio scenes are
described using so-called AudioBIFS.

A scene description is structured hierarchically and can be
represented as a graph, wherein leaf nodes of the graph form
the separate objects and the other nodes describe the
processing, e.g. positioning, scaling, effects etc. The
appearance and behavior of the separate objects can be
controlled using parameters within the scene description
nodes.

Invention



The invention is based on the recognition of the following 
facts:

The transmission and use of real, i.e. measured, room
impulse responses for the reproduction of sound signals with
this room characteristic has been the object of research and
development work for years. Problems exist with respect to
the measurement, post-processing, transmission and the
subsequent presentation in the desired listening room.

In the context of the MPEG-4 activities, the possibilities of
transmission and use of the impulse responses were examined
in the frame of the European research project "Carrouso".
Here, the transmission of long impulse responses turned out
to be the main problem. There exist three basic problems:

1. The necessary length of the impulse responses leads to the
   problem that they cannot directly be transmitted as a
   parameter with the field-update mechanism of MPEG-4.

2. The transmission with usual coding methods like AAC or
   MPEG-2/Layer 3 (mp3) results - because of the use of
   psychoacoustic compression methods - in falsification of
   the impulse response.

3. An update mechanism which uses the method according to 2.
   requires sending the impulse responses in the broadcast
   mode. The bandwidth necessary for this exceeds the
   bandwidth available for the transport of the media data.
   Therefore, an update of the data in time cannot be
   guaranteed.

The present invention describes a mechanism, with which the 
above-mentioned problems can be solved: 

The invention uses multiple successive field updates for the
params[128]-field in order to make complex system parameters
(e.g. a system impulse response) usable. The first
params[128]-field contains information about the number and
content of the following fields. This represents an extension
of the field updates, which are - by default - performed with
only one params[128]-field. The transmission of data of any
length is thus made possible. These data can then be stored
in an additional memory of a node and can be used during the
calculation of the effect. In principle, it is also possible
to replace or amend, respectively, only certain parts of the
field during operation, in order to keep the amount of
transmitted data as small as possible.
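
As a rough illustration of this mechanism, the following Python
sketch splits a long parameter list into successive 128-value
field updates and reassembles it on the receiving side. The
header layout used here (first value = number of following
fields, second value = number of data values) and the function
names are assumptions made for illustration only; the text
states only that the first field describes the number and
content of the following fields.

```python
import math

FIELD_SIZE = 128  # length of one params[128] field


def pack_field_updates(data):
    """Split data into one header field plus as many data fields as needed."""
    num_fields = math.ceil(len(data) / FIELD_SIZE)
    # Hypothetical header layout (an assumption, not taken from the text):
    # [number of following fields, number of data values], zero-padded.
    header = [float(num_fields), float(len(data))]
    header += [0.0] * (FIELD_SIZE - len(header))
    fields = [header]
    for i in range(num_fields):
        chunk = list(data[i * FIELD_SIZE:(i + 1) * FIELD_SIZE])
        chunk += [0.0] * (FIELD_SIZE - len(chunk))  # pad the last field
        fields.append(chunk)
    return fields


def unpack_field_updates(fields):
    """Collect the data fields in the node's additional memory."""
    header = fields[0]
    num_fields = int(header[0])
    num_values = int(header[1])
    memory = []
    for field in fields[1:1 + num_fields]:
        memory.extend(field)
    return memory[:num_values]  # drop the padding of the last field
```

With this layout, 300 data values occupy one header field and
three data fields.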

Exemplary embodiment 

AudioFXProto
Node interface
AudioFXProto {

  PROTO audio"Name" [
    exposedField  MFNode           audioFXChildren    []
    exposedField  MFFloat          audioFXParams      []
    exposedField  SFInt32          audioFXnumChannel  1
    exposedField  MFInt32          audioFXPhaseGroup  []
  ]

  DEF "Name" AudioFX {
    eventIn       MFNode           addChildren
    eventIn       MFNode           removeChildren
    exposedField  MFNode           children    []  children IS audioFXChildren
    exposedField  SFCommandBuffer  orch        []  only used in players with S.A. cap
    exposedField  SFCommandBuffer  score       []  only used in players with S.A. cap
    exposedField  MFFloat          params      []  params[128] IS audioFXParams
    field         SFInt32          numChannel  1   numChannel IS audioFXNumCh.
    field         MFInt32          phaseGroup  []  phaseGroup IS audioFXPhaseG
  }
}

Functionality and semantics

The AudioFXProto node provides an alternative to the AudioFX
node. It is tailored to consumer products and allows players
without Structured Audio capability to use basic audio
effects. The PROTO shall encapsulate the AudioFX node, so that
enhanced MPEG-4 players with Structured Audio capability can
decode the SAOL (Structured Audio Orchestra Language - format
for the description of instruments) and SASL (Structured
Audio Score Language - format for the description of scores)
token streams directly. Simpler consumer players shall only
identify the effects and start them from internal effect
representations, if available.

The description of the fields can be found in the description
of the AudioFX node in the MPEG-4 Systems standard
ISO/IEC 14496-1, 9.4.2.10. In short, the addChildren eventIn
specifies nodes that shall be added to the children field,
while the removeChildren eventIn specifies nodes that shall
be removed from the children field. The children array
contains the nodes operated upon by this effect. The orch
string contains a tokenized block of signal-processing code
written in SAOL. The score string may contain a tokenized
score for the given orchestra written in SASL. The params
field allows BIFS commands and events to affect the
sound-generation process in the orchestra. The numChan field
specifies the number of channels of audio output by this
node. The phaseGroup field specifies the phase relationships
among the various output channels.

The BIFS encoder does not encode the protoName string
directly - except if MPEG-J is used (USENAMES keyword) - but
replaces it by a PROTO-ID. This ID will normally be generated
automatically during the encoding of the BIFS stream.

The identification of fixed effects inside a consumer player
requires reserved IDs. These IDs should - for practical
reasons - be attached to the corresponding protoName strings
in a special namespace.

The number of possible PROTO-IDs is encoded in the 5-bit
variable PROTOIDbits in the BIFSv2Config flags. This enables
the player to use a maximum of 2^[illegible] IDs. Reserved
IDs shall be located in the "IDs reserved for ISO use" space.
For the corresponding protoName strings the Java naming
convention shall be used.

For reasons of expandability, further ID space should be
reserved in a space "IDs reserved for private use", defined
by levels. The reserved space could be used by the industry
to define their own proprietary nodes.

The protoName strings shall be replaced by their fixed
PROTO-IDs in the BIFS encoding process. In case of decoding
with a consumer MPEG-4 player, the occurrence of these IDs
shall cause the BIFS decoder to instantiate the corresponding
PROTOs from the matching internal PROTO Effect Nodes.

For a limited number of audio effects, called standard
effects, a Structured Audio code shall be defined that works
directly in enhanced players and is a reference for
player-internal implementations. This may be an argument to
keep the number of standard effects low.

The following standard effects shall be defined as FX nodes
and encapsulated in PROTOs:

Echo, Filter, Stereo-Base, Virtual Stereo, Equalizer,
Compressor, Reverb, NaturalReverb, Chorus, Flange and
Speed-Change (requires media control or (Advanced)AudioBuffer).
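
A consumer player could resolve reserved PROTO-IDs to its
built-in effects with a simple lookup table. The sketch below is
only illustrative: the numeric IDs are invented placeholders (no
reserved values are given in this text), and the function name
resolve_effect is hypothetical.

```python
# Hypothetical reserved PROTO-IDs; the actual values are not specified
# in this text and are chosen here purely for illustration.
STANDARD_EFFECTS = {
    0: "Echo",
    1: "Filter",
    2: "Stereo-Base",
    3: "Virtual Stereo",
    4: "Equalizer",
    5: "Compressor",
    6: "Reverb",
    7: "NaturalReverb",
    8: "Chorus",
    9: "Flange",
    10: "Speed-Change",
}


def resolve_effect(proto_id, available=STANDARD_EFFECTS):
    """Return the internal effect for a reserved ID, or None if unavailable."""
    return available.get(proto_id)
```

A player that implements only some of the effects would pass its
own, smaller table as the second argument.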

PROTO audioNaturalReverb

The audioNaturalReverb contains the following parameters: 
First params[] field:

  float    numParamsFields        1    1...60000
  float    numImpResp             0    0...32
  float    sampleRate
  float[]  reverbChannels         0    0, 1, 2, 3 ... 31
  float    impulseResponseCoding  0    0...1
           reserved

Following params[] fields:

  float    impulseResponseLength  0    240000 *
           impulseResponse             *

  * numImpResp times



The NaturalReverb PROTO uses the impulse responses of
different sound channels to create a reverberation effect.
Since these impulse responses can be very long (several
seconds for a big church or hall), one params[] array is not
sufficient to transmit the complete data set. Therefore, a
bulk of consecutive params[] arrays is used in the following
way:

The first block of params[] contains information about the
following params[] fields:

The numParamsFields field determines the number of following
params[] fields to be used. The NaturalReverb PROTO has to
provide sufficient memory to store these fields.

The numImpResp field defines the number of impulse responses
(number of channels used for reverberation). It must be
smaller than audioFXnumChannel in the AudioFX PROTO node
interface.

The reverbChannels field defines the mapping of the impulse
responses to the input channels.

The impulseResponseCoding field shows how the impulse
response is coded (see table below).

Case 1 can be useful to reduce the length of sparse impulse
responses.

The fields shall map to the first params[] array as follows:

  numParamsFields       = params[0]
  numRevChan            = params[1]
  sampleRate            = params[2]
  reverbChannels        = params[3 ... 3+numRevChan]
  impulseResponseCoding = params[3+numRevChan]



The following params[] fields contain the numImpResp
consecutive impulse responses as follows:

The impulseResponseLength gives the length of the following
impulseResponse.

The impulseResponseLength and the impulseResponse are
repeated numImpResp times.

The fields shall map to the following params[] arrays as
follows:

  impulseResponseLength = params[0]

  consecutive samples

  sampleNumber/sample
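
The repeated length/response layout can be read back with a
short loop. The following Python sketch assumes that the data
of the following params[] arrays have already been concatenated
into one flat list, that the impulse responses are coded as
consecutive samples, and that impulseResponseLength counts
sample values; the function name is hypothetical.

```python
def read_impulse_responses(values, num_imp_resp):
    """Read num_imp_resp records of (length, samples...) from a flat list."""
    responses = []
    pos = 0
    for _ in range(num_imp_resp):
        if pos >= len(values):
            raise ValueError("missing impulseResponseLength")
        length = int(values[pos])  # impulseResponseLength
        pos += 1
        if pos + length > len(values):
            raise ValueError("impulse response exceeds the transmitted data")
        responses.append(list(values[pos:pos + length]))  # impulseResponse
        pos += length
    return responses
```

For example, the flat list [3, 1.0, 0.5, 0.25, 2, 1.0, -0.5]
with numImpResp = 2 yields one response of three samples and
one of two samples.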

The exact method of calculating the reverberation according
to the specified parameters is not normative. 

The output shall be the reverberated sound signal. 
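
Since the calculation is not normative, any suitable
reverberation algorithm may be used. One common possibility,
shown here purely as an assumption and not as the method
prescribed by this text, is the direct convolution of the input
signal with the transmitted impulse response.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) convolution of a signal with an impulse response."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h  # each input sample excites a scaled copy of h
    return out
```

Direct convolution needs a number of multiplications equal to
the product of both lengths, so for responses of several
seconds a practical player would more likely use an FFT-based
fast convolution.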

Claims

1. Method for coding impulse responses of audio signals,
wherein said impulse responses allow the reproduction
of sound signals corresponding to a certain room
characteristic, comprising:

generating impulse responses of a sound source;

and

inserting parameters representing said generated
impulse responses in multiple successive field updates
for the params[128]-field, wherein a first params[128]-field
contains information about the number and content
of the following fields.

2. Method for decoding impulse responses of audio signals,
wherein said impulse responses allow the reproduction
of sound signals corresponding to a certain room
characteristic, comprising:

separating parameters representing impulse responses
from multiple successive field updates for the
params[128]-field, wherein a first params[128]-field
contains information about the number and content of the
following fields;

storing the separated parameters in an additional
memory of a node; and

using said stored parameters during the calculation
of the room characteristic.

3. Apparatus for performing a method according to claim 1
or 2.

Abstract



In the context of the MPEG-4 activities, the problem of
transmitting long impulse responses is solved by inserting
parameters representing said generated impulse responses in
multiple successive field updates for the params[128]-field,
wherein a first params[128]-field contains information about
the number and content of the following fields.



Empfangszeit 2.Dez. 14:28 



